Disturbance Injection Under Partial Automation: Robust Imitation Learning for Long-Horizon Tasks
Authors
Abstract
Partial Automation (PA) with intelligent support systems has been introduced in industrial machinery and advanced automobiles to reduce the burden of long-hours human operation. Under PA, operators perform two kinds of operations: manual operations (providing actions) and operations that switch between automatic and manual mode (mode-switching). Since PA reduces the total duration of manual operation, these two operations, action and mode-switching, can be replicated by imitation learning with high sample efficiency. To this end, this letter proposes Disturbance Injection under Partial Automation (DIPA) as a novel imitation learning framework. In DIPA, the mode and the actions (in the manual mode) are assumed to be observable at each state and are used to learn both action and mode-switching policies. This learning is robustified by injecting disturbances into the operator's actions, optimizing the disturbance level to minimize the covariate shift under PA. We experimentally validated the effectiveness of our method on long-horizon tasks in simulations and a real-robot environment, and confirmed that it outperformed previous methods and reduced the demonstration burden.
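The core idea of disturbance injection can be illustrated with a minimal sketch: while collecting demonstrations, Gaussian disturbances are added to the executed actions so the recorded states cover regions the learner is likely to drift into, while the supervisor's noise-free actions are kept as training labels. This is not the paper's implementation; the 1-D dynamics, the linear stand-in supervisor, and the fixed noise scale are all simplifying assumptions (the actual method also optimizes the disturbance level and learns a mode-switching policy).

```python
import numpy as np

rng = np.random.default_rng(0)

def supervisor_action(state):
    # Stand-in for the human operator's policy (hypothetical linear controller).
    return -0.5 * state

def collect_demos(n_steps, noise_scale):
    """Roll out the supervisor with Gaussian disturbances injected into the
    executed actions. The noise-free supervisor action is recorded as the
    label, so the dataset covers off-nominal states without corrupting the
    supervision signal."""
    states, labels = [], []
    state = 1.0
    for _ in range(n_steps):
        label = supervisor_action(state)                  # recorded target
        executed = label + rng.normal(0.0, noise_scale)   # disturbed action
        states.append(state)
        labels.append(label)
        state = state + executed                          # trivial 1-D dynamics
    return np.array(states), np.array(labels)

def fit_policy(states, labels):
    # Behavior cloning by least squares for a scalar gain: action ~ k * state.
    return np.dot(states, labels) / np.dot(states, states)

states, labels = collect_demos(200, noise_scale=0.1)
k = fit_policy(states, labels)
```

Because the labels remain the supervisor's clean actions, the cloned gain recovers the supervisor's policy while the visited states span a wider distribution than a disturbance-free rollout would produce.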
Similar resources
DART: Noise Injection for Robust Imitation Learning
One approach to Imitation Learning is Behavior Cloning, in which a robot observes a supervisor and infers a control policy. A known problem with this "off-policy" approach is that the robot's errors compound when drifting away from the supervisor's demonstrations. On-policy techniques alleviate this by iteratively collecting corrective actions for the current robot policy. However, these techn...
Iterative Noise Injection for Scalable Imitation Learning
One approach to Imitation Learning is Behavior Cloning, in which a robot observes a supervisor and infers a control policy. A known problem with this "off-policy" approach is that the robot's errors compound when drifting away from the supervisor's demonstrations. On-policy techniques alleviate this by iteratively collecting corrective actions for the current robot policy. However, these techn...
Truncated Horizon Policy Search: Combining Reinforcement Learning & Imitation Learning
In this paper, we propose to combine imitation and reinforcement learning via the idea of reward shaping using an oracle. We study the effectiveness of the near-optimal cost-to-go oracle on the planning horizon and demonstrate that the cost-to-go oracle shortens the learner's planning horizon as a function of its accuracy: a globally optimal oracle can shorten the planning horizon to one, leading t...
Robust and Incremental Robot Learning by Imitation
In recent years, Learning by Imitation (LbI) has been increasingly explored as a way to easily instruct robots to execute complex motion tasks. However, most approaches do not consider the case in which multiple, and sometimes conflicting, demonstrations are given by different teachers. Nevertheless, it seems advisable that the robot does not start as a tabula rasa, but re-uses previous...
Journal
Journal title: IEEE Robotics and Automation Letters
Year: 2023
ISSN: 2377-3766
DOI: https://doi.org/10.1109/lra.2023.3260586